Dipeptide formation on engineered hybrid peptide synthetases

Sascha Doekel and Mohamed A Marahiel 



Background: Nonribosomal peptide synthetases (NRPSs) are modular 
'megaenzymes' that catalyze the assembly of a large number of bioactive 
peptides using the multiple carrier thiotemplate mechanism. The modules 
comprise specific domains that act as distinct units to catalyze specific 
reactions associated with substrate activation, modification and condensation. 
Such an arrangement of biosynthetic templates has evoked interest in 
engineering novel NRPSs. 

Results: We describe the design and construction of a set of dimodular hybrid 
NRPSs. By introducing domain fusions between adenylation and thiolation 
(PCP) domains we designed synthetic templates for dipeptide formation. The 
predicted dipeptides, as defined by the specificity and arrangement of the 
adenylation domains of the constructed templates, were synthesized in vitro. 
The effect of the intramolecular fusion was investigated by determining kinetic 
parameters for substrate adenylation and thiolation. The rate of dipeptide 
formation on the artificial NRPSs is similar to that of natural templates. 

Conclusions: Several new aspects concerning the tolerance of NRPSs to 
domain swaps can be deduced. By choosing the fusion site in the border region 
of adenylation and PCP domains we showed that the PCP domain exhibits no 
general substrate selectivity. There was no suggestion that selectivity of the 
condensation reaction was biased towards the donor amino acid, whereas at 
the acceptor position there was a size-determined selection. In addition, we 
demonstrated that a native elongation module can be converted to an initiation 
module for peptide-bond formation. These results represent the first example of
rational de novo synthesis of small peptides on engineered NRPSs. 
Introduction

Nonribosomal peptide synthesis is an alternative route of 
peptide synthesis carried out by large microbial multifunc- 
tional enzymes termed nonribosomal peptide synthetases 
(NRPSs) [1,2]. Molecular characterization of NRPS genes 
has revealed a modular organization in which each module 
includes a full complement of catalytic sites (domains) 
required for a single step of chain elongation [3]. Size and 
sequence of the nonribosomal peptide are determined by 
the number and colinear arrangement of modules. Pep- 
tides are synthesized through amino acyl adenylate inter- 
mediates (recognized and activated by the adenylation or 
A domain) [4] that are subsequently tethered to a 4'-phos- 
phopantetheinyl cofactor of the adjacent thiolation (T) 
domain (or peptidyl-carrier protein, PCP) [5]. Thioesterified
substrates are concatenated via peptide-bond formation
in a stepwise amino→carboxy-terminal elongation
reaction catalyzed by the condensation (C) domain, the 
third essential domain of a minimal module [6]. Structural 
diversity in this family of bioactive natural products can be 
further enhanced by substrate modifications such as 
epimerization [7], N-methylation [8] or heterocyclization 
[9]. These modifications are catalyzed by auxiliary 
domains. Release of the nascent peptide is accomplished 



by a thioesterase (Te)-like domain, found at the carboxyl
terminus of most NRPSs [10]. The primary sequence, 
configuration of each α-carbon center and the extent of
modification of these products are therefore controlled by
the linear sequence of catalytic domains [11].

PCP domains are activated by post-translational transfer of
a 4'-phosphopantetheine cofactor to a conserved serine
residue, a reaction catalyzed by specialized coenzyme A
(CoA)-dependent 4'-phosphopantetheinyl transferases [12,13].

Many nonribosomal peptides have important agricultural
or medical uses [14-16]. Examining the catalytic flexibil- 
ity of NRPSs towards noncognate substrates is of interest 
because these results could pave the way to new drugs. 
Insight into the rules that govern the interaction of the 
multidomain arrangements should allow the rational 
design of novel peptides by module and domain swapping.
Some attempts have been made to re-design NRPSs
in vivo [17,18]. Biochemical characterization of distinct
domain types in vitro should provide a deeper mechanistic
understanding of NRPSs [5,19,20]. Designing a model 
NRPS system to evaluate the feasibility of such manipula- 
tions should test our understanding of NRPS architecture. 
In this paper we examine how NRPSs can be used to
design novel artificial templates to produce small peptides
de novo. To this end, a set of dimodular hybrid NRPSs was
constructed on the basis of fusing A and PCP domains. 
These hybrid enzymes were shown to catalyze specific 
adenylation and thiolation reactions of substrate amino 
acids, a condensation reaction and a terminating release that 
yielded dipeptides of the predicted sequence. Biochemical 
studies investigating individual steps in dipeptide forma- 
tion revealed that the domain types (apart from A domains) 
are quite tolerant with respect to processing non-native sub- 
strates. This approach should encourage the rational design 
of small peptides on engineered NRPS templates [21,22]. 

Results and discussion 

Engineering dimodular hybrid NRPSs 

We designed and constructed three dimodular hybrid 
NRPSs (Figure 1b). These constructs serve as a model



system for addressing important questions about nonribo- 
somal peptide synthesis. The main determinant of NRPS 
substrate specificity is the A domain, which selectively rec- 
ognizes and adenylates a cognate acyl amino acid [23]. In 
contrast, little is known about the substrate selectivity of 
downstream domains such as the PCP and C domains and 
whether (and to what extent) domain organization can be 
exploited to obtain A-domain-swapped NRPSs that form 
peptides with the predicted sequences. To answer these 
questions we started with a dimodular NRPS with a native 
termination module (GATTe) and introduced a domain 
fusion between the A and PCP domains of module 1 (A-T;
Figure 1b; see the Materials and methods section for
module definitions). This operation yielded a hybrid template
[A_Ile](BacA1)-[TCA_LeuTTe](TycC6) (enzyme I; Table 1)
that connects the native initiation module of the bacitracin
synthetase 1 BacA1 [9] with the carboxy-terminal module
of the tyrocidine synthetase TycC [19] (for details of

Figure 1

(a) Organization of the bac operon of Bacillus licheniformis
ATCC 10716 (top) and the tyc operon of Bacillus brevis ATCC 8185
(bottom). Domain organization is depicted by boxes in different shadings.
Three genes, bacA, bacB and bacC, encode the synthetases BacA,
BacB and BacC that assemble the branched cyclic dodecapeptide
bacitracin. Tyrocidine is synthesized by three peptide synthetases TycA,
TycB and TycC, encoded by tycA, tycB and tycC. The domains used for
the engineering of hybrid NRPSs are highlighted in color.
(b) Engineering of dimodular hybrid NRPSs by domain fusion. Left: gene
fragments corresponding to NRPS domains were amplified. Middle:
fusion leads to artificial hybrid genes and their corresponding hybrid
dimodular NRPSs. Right: potential dipeptides formed by the hybrid
templates. (c) Color/shading code of NRPS domain types. A circle in the
condensation domain of BacA2 indicates a cyclization domain.

cloning, see the Materials and methods section). We predicted
that the dipeptide Ile-Leu would be formed on the
basis of specificity of the A domains used (Figure 1a). A
second construct [A_Phe](TycB2)-[TCA_LeuTTe](TycC6) (enzyme
II; Table 1) was designed in an analogous manner
(Figure 1b). We also wanted to address the question of
whether a native elongation module can be transformed
into an initiation module. Here the native elongation
module TycB2 was repositioned to serve as an initiation
module. Phe-Leu was predicted to be the product on the
basis of the arrangement and specificity of the A domains.
In a third construct (Figure 1b), a second fusion site was
introduced between the A and PCP domains of module 2.
The resulting hybrid [A_Ile](BacA1)-[TCA_Phe](TycB2)-[TTe](TycC6)
(enzyme III; Table 1), a template for the production of the 
dipeptide Ile-Phe, converts a native elongation module 
into a termination module. The latter hybrid construct is a 
model for an engineered nonribosomal template for any 
dipeptide A-B that can be generated simply by fusing two 
modules with the specificities for A and B, respectively. 
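
The template logic described above can be sketched in a few lines of Python. This is only an illustrative model, assuming that each module can be reduced to the cognate amino acid of its A domain; the names `Module`, `predicted_dipeptide` and `HYBRIDS` are hypothetical and do not come from the original work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Minimal stand-in for an NRPS module (hypothetical model)."""
    source: str     # module the A domain was taken from, e.g. "BacA1"
    substrate: str  # cognate amino acid of the A domain, e.g. "Ile"


def predicted_dipeptide(donor: Module, acceptor: Module) -> str:
    """Predict the dipeptide from module order: donor first, acceptor second."""
    return f"{donor.substrate}-{acceptor.substrate}"


# The three hybrid templates described in the text (enzymes I, II and III).
HYBRIDS = {
    "I": (Module("BacA1", "Ile"), Module("TycC6", "Leu")),
    "II": (Module("TycB2", "Phe"), Module("TycC6", "Leu")),
    "III": (Module("BacA1", "Ile"), Module("TycB2", "Phe")),
}

for name, (donor, acceptor) in HYBRIDS.items():
    print(name, predicted_dipeptide(donor, acceptor))
```

The model deliberately ignores side specificities of the A domains and any selectivity of the C domain, both of which are examined experimentally in the sections that follow.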

We engineered the hybrid sites within the linker regions 
of the A and PCP domains. To define the optimal fusion 
site, we aligned the potential boundary region between 
the A and PCP domains, and identified a relatively unconserved
stretch of nine amino acids embedded within
highly conserved residues (Figure 2). The same site has 
been previously shown to be susceptible to proteolytic 
digestion [24]. Moreover, the recently solved solution 
structure of the PCP domain of TycC3 [25] also supports
this definition of a linker region, because the secondary
fold of the PCP domain starts from a highly conserved
tyrosine next to this linker region (Figure 2). We identi- 
fied no sequence conservation among the linker regions. It 
has been reported, however, that linker regions have a 
propensity to form defined secondary folds with a confor- 
mationally sensitive turn [26]. This turn is often repre- 
sented by a highly conserved proline residue that acts as a 
switch to coordinate or de-coordinate catalytic domains. 
Single residues among linker regions have, therefore,



Table 1 

Hybrid dimodular NRPSs made in this study. 

I    [A_Ile](BacA1)-[TCA_LeuTTe](TycC6)
II   [A_Phe](TycB2)-[TCA_LeuTTe](TycC6)
III  [A_Ile](BacA1)-[TCA_Phe](TycB2)-[TTe](TycC6)

Domains used as segments are indicated with square brackets, and
their names relate to the genes from which they have been extracted
(e.g. TycC6). We include the proposed amino acid substrate of the
A domain (e.g. Ile), written after an underscore. Hybrid fusion sites
are indicated with a dash.

been shown to be invariant for the function of multidomain
proteins [27].

Production of the hybrid proteins 

Hybrid enzymes (with a predicted mass of 215,000 Da)
were overproduced in Escherichia coli BL21(pgsp) as (His6)-tagged
fusion proteins and purified using a Ni-agarose
column. Typically 6 mg of protein was obtained from 1 liter of
culture (see the Supplementary material section). The
amount of III was considerably lower than that of I and II.
Overproduction in E. coli BL21(pgsp) yielded holo-enzymes,
resulting from co-expression with the 4'-phosphopantetheinyl
transferase gene gsp [28].

Adenylation reaction 

To test the substrate specificity of the engineered
NRPSs, ATP-PPi exchange reactions were performed [6].
As shown in Figure 3a, recombinant enzyme I activated
the cognate amino acids isoleucine and leucine to the
same level, whereas valine was activated to only
30% of the level of isoleucine or leucine.
The A domain of BacA1 was shown to activate
only isoleucine [9]. The level of isoleucine activation
on the second module TycC6 is anticipated to be
~30% (see the activation pattern of II).
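
The relative activation values quoted here follow from scaling each exchange measurement to the highest one, which is defined as 100%. A minimal sketch of that normalization is given below; the function name `relative_exchange` and the cpm values are hypothetical and were chosen only to mimic the pattern reported for enzyme I.

```python
def relative_exchange(cpm_by_substrate):
    """Express ATP-PPi exchange counts as a percentage of the highest value."""
    if not cpm_by_substrate:
        raise ValueError("no measurements given")
    top = max(cpm_by_substrate.values())
    if top <= 0:
        raise ValueError("highest count must be positive")
    return {aa: 100.0 * cpm / top for aa, cpm in cpm_by_substrate.items()}


# Hypothetical counts (cpm), not measured data.
example_cpm = {"Ile": 120_000, "Leu": 120_000, "Val": 36_000}

for amino_acid, percent in relative_exchange(example_cpm).items():
    print(f"{amino_acid}: {percent:.0f}%")
```

Because the values are scaled within one enzyme, percentages from different hybrid enzymes are not directly comparable unless the absolute counts are also reported.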



Figure 2 



Amino acid sequence alignment (using 
ClustalW and single-letter code) of the 
potential boundary region of the A and PCP 
domains used in this study. A poorly 
conserved stretch of nine amino acids was 
identified. This region is believed to represent 
a linker connecting the A and PCP domains. 
Boxed, the fusion site that alters the native
sequence to LQ (introduction of a PstI site, in
red) or VN (introduction of a HpaI site, in
green), respectively. Numbers at the left and
right indicate the position within the
polypeptide chain. A10 is a highly conserved
motif of A domains.
Figure 3


Chemistry & Biology 



Relative representation of ATP-PPi exchange
reactions with the purified hybrid NRPSs:
(a) I, (b) II and (c) III. The highest exchange
rate was defined as 100%. Maximum labeling
was 300,000 cpm in all three panels, and the
highest activation rates reached
~110,000-130,000 cpm each. Black
columns represent proteinogenic amino acids,
white columns nonproteinogenic amino acids.



Enzyme II activates tryptophan and phenylalanine (by
TycB2), as well as leucine and isoleucine (mainly by
TycC6; Figure 3b). The activation of the donor amino
acids tryptophan and phenylalanine was reduced compared
with that of the acceptor amino acids. Activation of
tryptophan by the A domain of TycB2 was predicted previously
[19] and explains the occurrence of tyrocidine
analogs that have a Phe→Trp substitution in the third
position (Figure 1a).

As shown in Figure 3c, enzyme III also activated the four
amino acids isoleucine and leucine (mainly by BacA1)
and tryptophan and phenylalanine (by TycB2). In addition,
considerable activation of tyrosine was observed.
The activation of tryptophan and phenylalanine by the
TycB2 module was enhanced compared with II. Also
remarkable was the observed high level of tyrosine activation
in III in contrast to the low activation by II. Both
hybrid enzymes contain the same module TycB2
(Figure 3b,c). This finding prompted us to investigate the
substrate specificity of III and II in more detail. We
found that the activation patterns were clearly different
for the nonproteinogenic amino acids 5'-Hydroxy-Trp,
Homo-Phe and 2'-Naphthyl-Ala (Figure 3b,c). In general,
III was less specific than II. The differences in substrate 
specificity are surprising because it has been proposed 
Table 2

Substrate affinity of the TycB2 A domain.

Enzyme  Substrate amino acid  Km (mM)  kcat/Km
II      Tryptophan            0.04     7.2 x 10^7
        Homo-Phe              0.4      3.9 x 10^6
        2'-Thienyl-Ala        0.8      4.5 x 10^7
III     Tryptophan            <0.01    >10^8
        Homo-Phe              0.01     6.9 x 10^8
        2'-Thienyl-Ala        1.2      1.4 x 10^7


that A domains are autonomous units that retain their cat- 
alytic potential even when overproduced as distinct pro- 
teins [19]. Our results, however, provide evidence that 
the exact activation pattern of an A domain is a function 
of an active-site configuration involving adjacent domain 
interaction [29]. The importance of conformational 
changes in the process of adenylation and substrate 
release has been discussed recently [24]. The kinetic 
parameters for the adenylation reaction demonstrate that 



II and III have different Km and kcat/Km values for tryptophan
and Homo-Phe, whereas the values for 2'-Thienyl-Ala
were comparable for both proteins (Table 2).
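
To make the comparison between II and III concrete, the Km values of Table 2 can be turned into fold changes. The sketch below is only a reading aid: the dictionary layout and the function name `km_fold_change` are hypothetical, and the "<0.01" entry for tryptophan in III is entered as 0.01, so the corresponding ratio is a lower bound.

```python
# Km values (mM) transcribed from Table 2.
KM_MM = {
    "II": {"Tryptophan": 0.04, "Homo-Phe": 0.4, "2'-Thienyl-Ala": 0.8},
    "III": {"Tryptophan": 0.01, "Homo-Phe": 0.01, "2'-Thienyl-Ala": 1.2},
}

# Entries reported only as "<value"; ratios built on them are lower bounds.
UPPER_BOUNDS = {("III", "Tryptophan")}


def km_fold_change(substrate):
    """Return Km(II) / Km(III); values above 1 mean a lower Km in III."""
    return KM_MM["II"][substrate] / KM_MM["III"][substrate]


for substrate in KM_MM["II"]:
    prefix = ">" if ("III", substrate) in UPPER_BOUNDS else ""
    print(f"{substrate}: {prefix}{km_fold_change(substrate):.1f}-fold")
```

On these numbers the Km for Homo-Phe is about 40-fold lower in III than in II, while the Km for 2'-Thienyl-Ala differs by less than twofold, in line with the statement that only the latter is comparable between the two proteins.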

Substrate thiolation 

To monitor transfer of the adenylated amino acids to the
PCP domain, the trichloroacetic acid (TCA)-precipitation
assay [6] using [14C]- or [3H]-labeled amino acids was
employed. The formation of thioesters is a crucial step in
the investigation of the hybrid NRPSs used in this study,
because this step directly involves substrate transfer
between domains of different origin. Only a proper interaction
between the A and PCP domains ensures significant
loading of substrate amino acids [29]. Previous
heterologous loading experiments in trans with dissected
A and PCP domains were obviously hampered by poor
domain-domain interactions, as revealed by impaired
substrate thiolation [30].

As shown in Figure 4, in the case of the three investigated
proteins I, II and III, a significant level of substrate transfer
to the heterologous PCP domain was feasible.



Figure 4 
Substrate thiolation on PCP domains monitored by trichloroacetic acid (TCA)-precipitation assay. The schematic presentation of the hybrid NRPS
shows which substrate amino acids are expected to be incorporated by the first and second modules, respectively (deduced from the specificity of
adenylation domains). Kinetics for the thiolation of substrate amino acids in (a) I, (b) II and (c) III.
Table 3

Dipeptides formed by the hybrid NRPSs used in this study.

Enzyme  Product*                Detection  Velocity†
I       Ile-Leu                 Yes        0.3‡
        Ile-Ile                 Yes        0.1
        Leu-Leu                            0.3
        Leu-Ile
II      Trp-Leu                 Yes        2.1
        Phe-Leu                 Yes        0.9
        Trp-Ile                 Yes        0.1#
        Phe-Ile                 -§
        2'-Thienyl-Ala-Leu¶     Yes        n.d.
        2'-Naphthyl-Ala-Leu¶    Yes        n.d.
        5'-Hydroxy-Trp-Leu¶     Yes        n.d.
        Homo-Phe-Leu¶           Yes        n.d.
        D-Phe-Leu
III     Ile-Phe                 Yes        0.5
        Ile-Trp                 Yes        <0.02
        Leu-Phe                 Yes        <0.02
        Leu-Trp                 Yes        n.d.
        Ile-2'-Thienyl-Ala**    Yes        n.d.
        Ile-2'-Naphthyl-Ala**   Yes        n.d.
        Ile-5'-Hydroxy-Trp**    Yes        n.d.
        Ile-Homo-Phe**          Yes        n.d.
        Ile-D-Phe**

'Yes' indicates that the dipeptide product was observed using HPLC-MS
and radioactive TLC. *If not indicated otherwise, both amino
acids are in the L-configuration. †In mol per min and mol of enzyme;
n.d., not determined. ‡In parallel, Ile-Ile but not Leu-Leu is formed.
#Dipeptide products resulting from an unspecific enzyme reaction are
also observed: Ile-Ile, Ile-Trp and Trp-Trp. §Trace amounts of Ile-Ile
are formed. ¶Analogous dipeptides with Ile in the acceptor position
cannot be formed. **Analogous dipeptides with Leu in the donor
position cannot be formed.

Analysis of protein I revealed fast incorporation of leucine 
and delayed incorporation of isoleucine (Figure 4a). 
Moreover, the kinetics of isoleucine incorporation might 
be a result of loading on both modules at the same time. 
Incorporation of valine (whose adenylation rate was 30% 
of the isoleucine and leucine rates) was very slow 
(Figure 4a) and may be caused by an uncatalyzed 
thioester formation. 

On the second module of II the incorporation rate of 
leucine was fast compared with isoleucine and valine 
(see Figure 4b, lower panel). The ratio of leucine, 
isoleucine and valine incorporation reflects the ATP-PPi
activation pattern for these amino acids. Substrate thiolation
on the first module was fast with tryptophan
(Figure 4b, upper panel), whereas the incorporation of 
phenylalanine was slow and not completed within 
20 min of incubation. The low incorporation of tyrosine 
(see Figure 4b) again may result from an uncatalyzed 
transfer. In both proteins I and II, transfer of the noncog- 
nate amino acids isoleucine and tryptophan to a hybrid 
PCP domain that normally serves as a carrier for 
ornithine (TycC5; Figure 1a) was feasible.



In the dimodular construct III, PCP domains in both
modules represent hybrid junctions: a hydrophobic amino
acid (isoleucine) is to be loaded on a PCP domain naturally
engaged in proline processing (first module) and a large
aromatic amino acid (tryptophan) is to be loaded on a PCP
domain originally associated with a leucine-activating
module (second module; Figure 1a). As shown in
Figure 4c, the incorporation rates of putative donor amino
acids such as isoleucine, leucine and valine were slow and
not completed within 20 minutes (upper panel). This slow
incorporation on the first module had also been observed with
protein I (Figure 4a). Incorporation of the acceptor amino
acids tryptophan, phenylalanine and tyrosine was relatively
fast (Figure 4c); whereas tryptophan and phenylalanine reached
high levels of incorporation, that of tyrosine stayed at a
lower but significant level. The difference in loading of
aromatic amino acids in III and II can be attributed to the
altered substrate affinity of the TycB2 A domain.

Using intramolecular fusions of A and PCP domains we 
have been able to demonstrate a proper thiolation reac- 
tion. The rules that govern domain-domain interactions 
between A and PCP domains are probably conserved 
irrespective of the cognate substrate. A fixed arrange- 
ment and proximity of the two domains, as provided by a 
short linker region, seems to be sufficient for productive 
communication [31]. We found no evidence to suggest 
that substrate specificity is influenced at the level of 
thioester formation. 

Product formation 

Peptide formation on dimodular NRPSs implies an elon- 
gation reaction catalyzed by a C domain and subsequent 
product release catalyzed by a Te domain. The essential
role of the C domain for peptide-bond formation has been 
demonstrated previously using a truncated model system 
[6]. The role of the Te domain in releasing nonribosomal 
peptides from the enzyme has been underlined by in vivo
studies [18]. Te domains may also catalyze cyclization or 
multimerization of products [32]. A thioesterase with 
inherent substrate specificity was a topic of investigation 
for the Te domain in the erythromycin-producing polyke- 
tide synthase [33]. It has been shown recently that the 
thioesterase domain of NRPSs is sufficient for product 
release from artificial templates in vitro [34].

There is little data available about the influence of the C 
domain on substrate specificity [35]. Recent studies 
suggest the C domain exerts more influence on the nature 
of the acceptor amino acid than the donor amino acid [20]. 

The dipeptide products of the three hybrid NRPSs pre- 
pared in this study were investigated. The structure of the 
predicted dipeptides should be dependent on the order
and specificity of the two modules used. For product 
detection we used thin-layer chromatography (TLC) with 
radiolabeled substrate amino acids (not shown), reverse-phase
high-performance liquid chromatography (HPLC)
and coupled HPLC-MS methods. Relevant compounds 
were compared with standard dipeptides. The results of 
dipeptide formation are summarized in Table 3. 

In the presence of both substrate amino acids, isoleucine 
and leucine, I produced the dipeptides Ile-Leu and Ile-Ile, 
but no Leu-Leu or Leu-Ile were formed (Figure 5). The 
ratio between Ile-Leu and Ile-Ile was about 5:1. When the 
enzyme was incubated with just leucine, Leu-Leu was 
detected. Larger amounts of Ile-Ile were produced in the 
absence of leucine. As predicted from the substrate speci- 
ficity of the first and second modules, Ile-Leu is the major 
product. The parallel formation of Ile-Ile may reflect the 
reduced isoleucine activation of the second module. Sur- 
prising is the high level of Leu-Leu formed in the absence 
of isoleucine — when isoleucine is present, no Leu-Leu is 
formed. The processing of leucine is probably affected 
when the more favored donor amino acid isoleucine is 
present. Regarding the absence of Leu-Leu formation and
the reduced level of Ile-Ile in the presence of both substrate
amino acids isoleucine and leucine, it is clear that the
dipeptide Leu-Ile is not formed. Kinetic parameters for

Figure 5 



dipeptide formation revealed a turnover of 0.3 min⁻¹ for
Ile-Leu and Leu-Leu, in contrast to Ile-Ile, which was
formed with a turnover of 0.1 min⁻¹ (Table 3).

Hybrid enzyme II was shown to produce the dipeptides 
Trp-Leu, Phe-Leu and Trp-Ile when incubated with the 
relevant substrate amino acids (Figure 6). Phe-Ile was not 
detected. The formation of Trp-Leu was preferred and 
reached a turnover rate of 2.1 min⁻¹. Substitution of the
donor amino acid tryptophan with phenylalanine and the
acceptor amino acid leucine with isoleucine reduced the
turnover rates to 0.9 min⁻¹ and 0.1 min⁻¹, respectively (see
Table 3). The reduced level of Phe-Leu production is 
consistent with the lower adenylation and thiolation rates 
for phenylalanine compared with tryptophan. When the 
enzyme was incubated with tryptophan and isoleucine, 
other dipeptides resulting from unspecific reactions such as 
Ile-Ile, Trp-Trp and Ile-Trp were observed to a smaller 
degree than Trp-Ile (not shown). This might be explained 
in the following way: when elongation of the thioesterified 
amino acid precursors runs too slowly, competitive captur- 
ing of other amino acids in solution becomes more promi- 
nent [36]. A similar observation was made when 
phenylalanine and isoleucine were incubated: instead of 




Dipeptides formed by the hybrid enzyme I
detected by reverse-phase HPLC-MS analysis.
(a) HPLC-MS diagrams: in the presence of
substrate amino acids isoleucine and leucine
(red) the dipeptides Ile-Leu and (to a lesser
extent) Ile-Ile are formed. Incubation with
isoleucine (blue) or leucine (green) leads to
formation of Ile-Ile and Leu-Leu, respectively.
(b) Mass spectra of detected products: Ile-Ile,
Ile-Leu and Leu-Leu.
Figure 6

Dipeptides formed by the hybrid enzyme II detected by reverse-phase HPLC-MS analysis. (a) HPLC-MS diagrams of Phe-Leu (blue) and Trp-Leu
(red) formation. (b) Mass spectra of detected products: Phe-Leu and Trp-Leu.



the predicted dipeptide Phe-Ile being synthesized, Ile-Ile
was formed (data not shown).

Nonproteinogenic amino acids resembling tryptophan and 
phenylalanine were also processed into dipeptides, but 
D-Phe was not a substrate (Table 3). As shown in Figure 7 
the only dipeptide formed to a considerable degree by 
hybrid enzyme III was Ile-Phe, with a velocity of
0.5 min⁻¹ (Table 3). Hybrid enzyme III therefore displayed
the highest selectivity. Substitution of the donor
amino acid isoleucine with leucine and the acceptor amino
acid phenylalanine with tryptophan strongly lowered
dipeptide formation (less than 0.02 min⁻¹; Table 3). No
Leu-Trp was detectable. The system also did not tolerate 
a D-configured phenylalanine. 
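
One crude way to compare the selectivity of the three templates is to divide the turnover of the fastest product by that of the second-fastest one. The sketch below does this with the velocities from Table 3; the `selectivity` function and the `TURNOVER` layout are hypothetical, and the ratio is an assumption made for illustration, not a measure used in the original analysis.

```python
# Turnover numbers (min^-1) transcribed from Table 3.
# "<0.02" entries are entered as 0.02, so the ratio for III is a lower bound.
# Leu-Leu (enzyme I) is left out because it forms only when isoleucine is absent.
TURNOVER = {
    "I": {"Ile-Leu": 0.3, "Ile-Ile": 0.1},
    "II": {"Trp-Leu": 2.1, "Phe-Leu": 0.9, "Trp-Ile": 0.1},
    "III": {"Ile-Phe": 0.5, "Ile-Trp": 0.02, "Leu-Phe": 0.02},
}


def selectivity(enzyme: str) -> float:
    """Ratio of the fastest product turnover to the second-fastest one."""
    rates = sorted(TURNOVER[enzyme].values(), reverse=True)
    return rates[0] / rates[1]


for enzyme in TURNOVER:
    print(f"{enzyme}: {selectivity(enzyme):.1f}-fold")
```

With these inputs the ratio is about 3 for I, about 2.3 for II and at least 25 for III, which matches the observation that III is the most selective template.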

Synthetic templates 

The formation of dipeptides by a dimodular NRPS is a 
multistep process [1] that involves activation of substrate 
amino acids to adenylates and subsequent thiolation of the 
activated precursors to enzyme-bound intermediates, fol- 
lowed by peptide-bond formation. The Te domain com- 
pletes the process by releasing the product. To monitor 
single reactions in the progress of dipeptide formation we 
used an in vitro system.



Activation: ATP-PPi exchange reaction

The A domain is the specificity-conferring domain in each 
NRPS module [23], and therefore dictates the primary 
structure of nonribosomal peptides. It is for this reason that A
domains have been studied in the greatest detail. A 
domains were the first domain type of NRPSs to be inves- 
tigated as recombinant proteins [37]. They are 
autonomous domains that retain their catalytic activity 
when overproduced as distinct proteins [38]. The amino- 
acid specificity of these recombinant domains was found to 
accurately match the predictions of in vivo studies [19]. By 
employing A domains artificially linked to PCP domains 
we were able to investigate the influence of neighboring 
domains on substrate specificity and affinity. We found 
that the selectivity and affinity of the TycB2 A domain 
was affected by the surrounding domain arrangement in 
the dimodular hybrid enzymes II and III. The latter was
found to have a reduced selectivity towards homologous 
nonproteinogenic substrates (Figure 3b,c; Table 2). 

The importance of conformational changes in multi- 
domain modules during the process of amino-acid adeny- 
lation and product release has been discussed previously 
[29]. It is possible that the introduction of fusion sites 
could impair these conformational changes. 
Because enzyme II is able to catalyze dipeptide formation,
we have essentially demonstrated that the internal elongation
module TycB2 can be converted into an initiation
module simply by repositioning (and deletion of the
C domain). No other information appears to be required.

Thiolation: TCA-precipitation assay 

In the dimodular NRPSs examined in this study, the trans- 
fer of adenylated amino acid precursors from the A domain 
to the PCP domain involved an engineered junction. Het- 
erologous substrate loading on fused A and PCP domains 
was shown to be feasible. Two conclusions can be drawn 
from this finding. First, the domain-domain interaction 
between the A and PCP domains is maintained sufficiently 
when fused together via a short linker region. Second, PCP 
domains exhibit no obvious specificity towards the amino 
acid to be loaded. In principle, thiolation was found to be 
dictated by the specificity of the preceding A domain. A possible
function of PCP domain specificity could have been
interpreted as a sort of proof-reading function of the NRPS.

Because the ATP-PPi exchange reaction monitors the
reverse reaction of substrate activation (the breakdown of
adenylates with PPi to ATP and amino acid) this assay is not a direct
measurement of adenylate formation. It has been pro- 
posed that less-stable trapped adenylates are more suscep- 
tible to hydrolysis, and that the stability of adenylate 
intermediates may therefore contribute to substrate selec- 
tivity of downstream processes including substrate thiola- 
tion [24]. Following on from this idea, differences in the 
thiolation of aromatic amino acids between the hybrid 
enzymes II and III (Figure 4b,c; Table 2) can be 
explained by an altered kinetic pattern of the preceding A 
domain rather than by selectivity of the PCP domain. 
Vice versa, the downstream process of the thiolation reaction
could influence the upstream A domain reaction by
inducing the capturing of substrate adenylates. 

Elongation and product release 

There is no direct assay available to follow the elongation 
reaction. The formation of dipeptides and their subse- 
quent release from the hybrid enzymes used in this study 
can be monitored using TCA-precipitation kinetics as 
described in the Materials and methods section. 

TCA-precipitation kinetics were performed with III, which is a
highly specific template that allows only Ile-Phe to be
formed in significant amounts (Table 3). Substrate adeny- 
lation and thiolation also predicted the formation of dipep- 
tides with leucine in the donor position and tryptophan in 
the acceptor position (Figures 3c and 4c). The kinetics 
monitoring the incorporation level of [14C]-labeled
isoleucine are shown in Figure 8. The acceptor amino
acids in this reaction can be categorized in three groups. 
First, phenylalanine, 2'-Thienyl-Ala and Homo-Phe 
induce an immediate decline in isoleucine incorporation. 



Figure 7 



Dipeptides formed by III detected using reverse-phase HPLC-MS
analysis. (a) HPLC-MS diagrams of Ile-Phe (red), Leu-Phe (blue) and
Ile-Trp (green) formation. (b) Mass spectra of detected products:
Ile-Phe, Leu-Phe and Ile-Trp.



The expectation is that the elongated dipeptides are
readily cleaved by the action of the Te domain. The 
second group includes acceptor amino acids tryptophan 
and 2'-Naphthyl-Ala, which induce a slight increase in TCA
precipitate followed by a slow decline. Because no dou- 
bling in isoleucine labeling is observed and the incorpora- 
tion level remains high for more than 20 minutes we 
conclude that both condensation and termination reac- 
tions are slow. The third group is represented by D-Phe, 
an acceptor amino acid that leads to a constant incorpora- 
tion level. Here the condensation reaction seems to be 
blocked. These data are consistent with the observation 
(Figure 7 and Table 3) that only Ile-Phe, Ile-2'-Thienyl-Ala
and Ile-Homo-Phe are formed in significant amounts.
In contrast, Ile-Trp and Ile-2'-Naphthyl-Ala are formed
only very slowly (0.02 min⁻¹; Table 3), and Ile-D-Phe is
not detectable. 

All three dimodular hybrid NRPSs presented in this study 
employ C domains fused to noncognate A domains in the 
donor position. None of these artificial NRPSs were 
affected in dipeptide formation, which supports the model 
that the donor position of C domains has little or no influence
on substrate selectivity (Table 3).

Figure 8

Kinetics of amino acid thiolation (isoleucine) on the first module of III in
the presence of excess amounts of amino acids thioesterified on the
second module. Phenylalanine, 2'-Thienyl-Ala and Homo-Phe induce
an immediate decline in TCA-precipitable isoleucine label, due to
formation and release of the dipeptides. Dipeptide formation with
tryptophan and 2'-Naphthyl-Ala is delayed, resulting in a slight increase
in TCA-precipitable isoleucine label (reloading of the donor position).
With D-Phe as the acceptor amino acid, a constant level of isoleucine
thiolation is observed, indicating that condensation to the Ile-D-Phe
dipeptide does not occur. The level of isoleucine thiolation for the first
30 min is represented in an idealized shape.



In contrast, the flexibility of the C domain to elongate 
noncognate amino acids was restricted to isoleucine;
the structurally related amino acid leucine was not used.
The C domain of TycB2 in the acceptor position was
found to be selective for the cognate amino acid phenylalanine
rather than tryptophan (Table 3). This explains
the substitution of phenylalanine to tryptophan at position 
3 in the cyclic decapeptide tyrocidine [39,40], even 
though the substrate adenylation favors tryptophan. The 
observation that C domains exhibit a selectivity towards 
the acceptor amino acid and not towards the donor amino 
acid is consistent with recent results [20]. 

Significance 

Many medically and agriculturally important peptides 
are synthesized nonribosomally by large, multienzyme 
complexes called nonribosomal peptide synthetases 
(NRPSs). Their modular nature has attracted attention 
because of the potential for designing, engineering and 
generating novel peptides that might be useful as drugs. 
We have demonstrated here that a set of designed 
dimodular hybrid NRPSs catalyzes formation of the
predicted dipeptides as defined by the specificity of the
employed adenylation (A) domains. The de novo design 
of dipeptides with a given sequence was accomplished by 
fusing various A and thiolation or peptidyl-carrier 
protein (PCP) domains. The identification of a potential 
interdomain linker region uncovered the suitable fusion 
site. The results show that the interaction between 
foreign A and PCP domains is not impaired, allowing 
heterologous substrate thiolation. We found that there is 
a considerable degree of tolerance towards noncognate 
substrates for the condensation reaction at the donor 
position, in contrast to the reaction at the acceptor posi- 
tion, where there appears to be a size-exclusion selectiv- 
ity. In addition, we demonstrated that a native 
elongation module can be converted into an initiation
module by repositioning. The efficiency of the hybrid
NRPSs in generating dipeptides in vitro was not impaired
compared with that of the native system. The results characterize
NRPSs as a versatile tool for generating bioengineered
peptides.

Materials and methods 

Bacterial strains and growth conditions 

E. coli XL1 Blue (Stratagene, Heidelberg, Germany) was used for
preparation of recombinant plasmids. Overproduction of recombinant
proteins was carried out in E. coli BL21(λDE3)/pgsp using
standard protocols [41].

Definition of domains and modules 

In this study we have defined an NRPS module as an arrangement of
C, A and PCP (or T) domains (e.g. CAT). Initiation modules do not
contain a C domain (i.e. AT). A dimodular NRPS that initiates and terminates
dipeptide formation has the domain organization ATCATTe. In
this study we use, for the first time, an intramodular A-PCP domain
fusion, so that the unit to be repositioned comprises the PCP (or T),
C and A domains.
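
As an illustration only (it is not part of the original methods), the domain notation defined above can be handled programmatically. The Python sketch below tokenizes a domain string and groups it into modules, under the assumption that every C domain opens a new module. The function names `split_domains` and `group_modules` are hypothetical.

```python
import re

# Domain symbols used in the text: A (adenylation), T (thiolation/PCP),
# C (condensation), Te (thioesterase), E (epimerization).
# "Te" is listed first so that it is matched before a bare "T".
DOMAIN_PATTERN = re.compile(r"Te|A|T|C|E")


def split_domains(organization: str) -> list[str]:
    """Tokenize a domain string such as 'ATCATTe' into single domains."""
    tokens = DOMAIN_PATTERN.findall(organization)
    if "".join(tokens) != organization:
        raise ValueError(f"unrecognized domain symbols in {organization!r}")
    return tokens


def group_modules(organization: str) -> list[list[str]]:
    """Group domains into modules (assumption: each C domain starts a module)."""
    modules: list[list[str]] = []
    for domain in split_domains(organization):
        if domain == "C" or not modules:
            modules.append([domain])
        else:
            modules[-1].append(domain)
    return modules


# A hybrid assembled from an A domain, a repositioned T-C-A unit and a T-Te unit
hybrid = "".join(["A", "TCA", "TTe"])
print(hybrid)                 # ATCATTe
print(group_modules(hybrid))  # [['A', 'T'], ['C', 'A', 'T', 'Te']]
```

Grouping by C domains reproduces the definitions given here: the first module (AT) is an initiation module, and the second (CAT) carries the terminal Te domain.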

Identification of a potential interdomain linker between A and 
PCP domains 

Sequence alignments of the potential boundary region of A and PCP 
domains of about 50 NRPS modules from Bacilli were performed using 
the ClustalW program. These alignments revealed a stretch of nine 
amino acids not highly conserved between 38 and 46 residues amino- 
terminal of the PCP domain core motif LGGHS. This nine-residue site 
is believed to function as a linker region connecting the NRPS's A and 
PCP domains. For hybrid fusions at positions 7 and 8 of the nine amino
acid stretch, the sequence was altered to Leu-Gln and Val-Asn by
introducing PstI and HpaI restriction sites into the plasmid (Figure 2).
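
The position of the linker stretch and the effect of the two restriction sites can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: distances are counted back from the first residue of the LGGHS motif, the toy sequence is invented, and the helper names are hypothetical. The codon assignments (CTG Leu, CAG Gln, GTT Val, AAC Asn) are from the standard genetic code and assume that the site is read in frame.

```python
MOTIF = "LGGHS"  # PCP domain core motif named in the text

# Only the four codons needed for the PstI and HpaI sites
CODONS = {"CTG": "L", "CAG": "Q", "GTT": "V", "AAC": "N"}


def linker_window(seq: str, motif: str = MOTIF) -> tuple[int, str]:
    """Return (start index, nine-residue stretch) lying 38 to 46 residues
    amino-terminal of the motif (assumed counting convention)."""
    i = seq.find(motif)
    if i < 46:  # also covers i == -1 (motif not found)
        raise ValueError("motif missing or too close to the N-terminus")
    start = i - 46
    return start, seq[start:i - 37]


def fuse_at_linker(seq: str, dipeptide: str, motif: str = MOTIF) -> str:
    """Replace positions 7 and 8 (1-based) of the nine-residue stretch."""
    start, _ = linker_window(seq, motif)
    p = start + 6
    return seq[:p] + dipeptide + seq[p + 2:]


def translate_site(site: str) -> str:
    """Translate an in-frame six-base restriction site into two residues."""
    return "".join(CODONS[site[k:k + 3]] for k in range(0, len(site), 3))


# Invented toy sequence: 10 residues, 9-residue linker, 37 residues, motif
toy = "G" * 10 + "ACDEFHIKM" + "G" * 37 + MOTIF + "G" * 5
print(linker_window(toy))                    # (10, 'ACDEFHIKM')
print(translate_site("CTGCAG"))              # LQ (PstI site)
print(translate_site("GTTAAC"))              # VN (HpaI site)
print(fuse_at_linker(toy, "LQ")[10:19])      # ACDEFHLQM
```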

Cloning of hybrid dimodular NRPS genes 
Plasmid p[AIle](bacA1)-[TCALeuTTe](tycC6) is a derivative of
p[AIle](bacA1)-[TE](tycA), which is based on pQE60 (Qiagen). The latter
contains the A domain of bacA1 fused to the PCP and epimerization
domains of tycA by an artificial PstI site (S.D. and M.A.M., unpublished
observations). A 1605 bp PCR product containing DNA encoding the
bacA1 A domain was obtained using chromosomal DNA from
B. licheniformis ATCC10716 and the following oligonucleotides:
5'-TTTCCATGGTTGCTAAACATTCATTAGA-3' and
5'-TTCCTGCAGCGCCCCCGCCGTTCTG-3'
(italic, modified sequences; bold, restriction site). A PCR product
containing DNA coding for the domain organization TCATTe from the
tyrocidine synthetase gene tycC6 was obtained using chromosomal DNA
from B. brevis ATCC8185 and the following oligonucleotides:
5'-ATACTGCAGGAGTATGTAGCGCCGC-3' and
5'-ATAGGATCCTTTCAGGATGAACAGTTCTTG-3'. After digestion of the
product with PstI and BamHI the 4125 bp DNA fragment was cloned into
p[AIle](bacA1)-[TE](tycA) digested with PstI and BamHI, to remove the
3' terminal tycA DNA, yielding p[AIle](bacA1)-[TCALeuTTe](tycC6).

p[APhe](tycB2)-[TCALeuTTe](tycC6) is a derivative of
p[AIle](bacA1)-[TCALeuTTe](tycC6). A PCR product containing DNA
encoding the tycB2 A domain was obtained using chromosomal DNA from
B. brevis ATCC8185 and the following oligonucleotides:
5'-AATCCATGGTGACTGCGCATGAG-3' and
5'-AATCTGCAGTGTTGCAGGGTTTCCTTGC-3'. After digestion with PstI
and NcoI the 1563 bp DNA fragment was cloned into
p[AIle](bacA1)-[TCALeuTTe](tycC6) digested with the same enzymes to
remove the bacA DNA, yielding p[APhe](tycB2)-[TCALeuTTe](tycC6).

p[AIle](bacA1)-[TCAPhe](tycB2)-[TTe](tycC6) is a derivative of
p[AIle](bacA1)-[TCALeuTTe](tycC6). The latter was reamplified using
inverse PCR techniques that lead to the exclusion of parts of the
second module ([TCAPhe](tycB2)). The following primers were used:
5'-AGGGTTAACGAATACGTGGCCCCGAG-3' and
5'->MTGTTAACCTCCTGCAGCGGCCC-3'. The DNA product was digested using
the restriction enzyme HpaI and religated, yielding the 6018 bp plasmid
p[AIle](bacA1)-[TTe](tycC6). A PCR product containing DNA coding for
parts of the tycB2 module (domain organization TCA) was obtained using
chromosomal DNA from B. brevis ATCC8185 and the following
oligonucleotides: 5'-ACGCTGCAGGATTACGTCGCCCCGA-3' and
5'-AGGGTTAACTGTTGCAGGCTTTCCTTC-3'. The 3111 bp DNA fragment was
digested with PstI and HpaI and cloned into PstI- and HpaI-digested
p[AIle](bacA1)-[TTe](tycC6) to yield
p[AIle](bacA1)-[TCAPhe](tycB2)-[TTe](tycC6).

Overproduction and purification of dimodular hybrid NRPSs 
E. coli BL21(λDE3)/pgsp was transformed with the plasmids
described above, and the constructs were expressed as described
previously [6]. pgsp is a derivative of pREP4 carrying the
4'-phosphopantetheinyl transferase gene (gsp) under the control of a
T7 promoter. Coexpression with gsp allows NRPS holo-enzymes to be
produced in vivo by modification with CoA. Cells were induced with
0.2 mM IPTG at A600 0.7 and allowed to grow for an additional 1.5 h
before being harvested. Purification of the (His6)-tagged proteins was
basically carried out as described previously [6]. Overproduction and
purification were analyzed with Coomassie Brilliant Blue stained SDS
polyacrylamide gels [42]. Protein concentration was measured using
the Bradford method [43]. Proteins could be stored on ice for at least a
week with no observable loss of activity.

ATP-PPi exchange reaction

The ATP-PPi exchange reaction that monitors formation of adenylates
was carried out as described previously [6]. The amount of enzyme
was 20 pmol in a total volume of 100 µl. Substrate amino acids were
used at a final concentration of 1 mM. To measure kinetic constants,
time courses of the initial velocity of ATP-PPi exchange were performed
at varying amino acid concentrations. The ATP concentration was 5 mM.
Kinetic constants were determined using Lineweaver-Burk graphs.
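
For readers who want to reproduce this kind of analysis, the following Python sketch shows how kinetic constants can be read from a Lineweaver-Burk (double-reciprocal) graph. It relies on the standard relation 1/v = (Km/Vmax)(1/[S]) + 1/Vmax and an ordinary least-squares line. The function name and the data points are hypothetical and are not taken from this study.

```python
def lineweaver_burk(substrate_mM, velocity):
    """Estimate (Km, Vmax) from a least-squares fit of 1/v against 1/[S].

    Slope = Km/Vmax and intercept = 1/Vmax. Km is returned in the units of
    substrate_mM and Vmax in the units of velocity.
    """
    x = [1.0 / s for s in substrate_mM]
    y = [1.0 / v for v in velocity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Hypothetical, noise-free data generated from Km = 0.5 mM and Vmax = 10
substrate = [0.1, 0.25, 0.5, 1.0, 2.0]
rates = [10.0 * s / (0.5 + s) for s in substrate]
km, vmax = lineweaver_burk(substrate, rates)
print(round(km, 3), round(vmax, 3))
```

Because the reciprocal transform amplifies errors at low substrate concentrations, real measurements would usually be checked against a direct nonlinear fit as well; that is a general remark and not a description of how the constants in this paper were obtained.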

Trichloroacetic acid precipitation assay 

The TCA precipitation assay used to monitor substrate thiolation was
carried out as described previously [6]. The amount of enzyme was
50 pmol in a total volume of 100 µl. Substrate amino acids were generally
used in a threefold molar excess over the enzyme. For kinetics
according to Figure 9, 50 pmol enzyme III was incubated at 37°C for
30 min with 150 pmol [14C]-labeled isoleucine (substrate of the first
module) in a total volume of 100 µl. A solution of 1 mM nonlabeled
substrate amino acid of the second module in a total volume of 50 µl was
added. Incorporation level of labeled isoleucine was followed by taking
samples at distinct times according to the TCA precipitation assay.
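
The amounts quoted above translate into concentrations and molar ratios as shown in this short Python calculation. It is a worked example only; the final volume of 150 µl is an assumption (100 µl preincubation plus the 50 µl addition), and the variable names are illustrative.

```python
def micromolar(amount_pmol: float, volume_ul: float) -> float:
    """pmol per microliter is numerically equal to micromolar."""
    return amount_pmol / volume_ul


enzyme_pmol = 50.0
isoleucine_pmol = 150.0
preincubation_ul = 100.0

# 1 mM = 1000 pmol/µl, so 50 µl of a 1 mM solution holds 50,000 pmol
acceptor_pmol = 1000.0 * 50.0
final_volume_ul = preincubation_ul + 50.0  # assumed: volumes simply add

enzyme_uM = micromolar(enzyme_pmol, preincubation_ul)
isoleucine_uM = micromolar(isoleucine_pmol, preincubation_ul)
acceptor_final_uM = micromolar(acceptor_pmol, final_volume_ul)

print(enzyme_uM)                      # 0.5
print(isoleucine_uM)                  # 1.5
print(isoleucine_pmol / enzyme_pmol)  # 3.0
print(acceptor_pmol / enzyme_pmol)    # 1000.0
print(round(acceptor_final_uM, 1))    # 333.3
```

The labeled donor is therefore present at a threefold excess over the enzyme, whereas the nonlabeled acceptor is added at roughly a thousandfold excess.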

During elongation in III, isoleucine is loaded onto the PCP domain of
module 2. Reloading of the module 1 PCP domain therefore results in
a 100% increase in isoleucine incorporation. If elongation and termination
of processed dipeptides occur simultaneously, an immediate
decline in isoleucine incorporation is observed.
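
The reasoning above can be turned into a simple decision rule. The Python sketch below sorts a normalized time course of TCA-precipitable isoleucine label into the three behaviours discussed in the text. The 5% tolerance, the function name and the example traces are hypothetical assumptions chosen for illustration, not measured data.

```python
def classify_time_course(relative_label, tolerance=0.05):
    """Classify a time course of TCA-precipitable isoleucine label.

    relative_label: label at successive time points, normalized so that
    1.0 is the level just before the acceptor amino acid is added
    (2.0 would correspond to complete reloading of the donor position).
    """
    peak = max(relative_label)
    final = relative_label[-1]
    if peak > 1.0 + tolerance:
        # label rises: donor position is reloaded before the product leaves
        return "delayed condensation and release"
    if final < 1.0 - tolerance:
        # label falls at once: dipeptide is formed and cleaved off
        return "immediate elongation and release"
    # label unchanged: no transfer to the acceptor
    return "condensation blocked"


# Invented example traces
print(classify_time_course([1.0, 0.8, 0.5, 0.3]))
print(classify_time_course([1.0, 1.15, 1.2, 1.1]))
print(classify_time_course([1.0, 1.0, 0.98, 1.01]))
```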

Analogous kinetics were also performed on II, underlining that the
protein elongates and terminates both dipeptides with tryptophan and 
phenylalanine in the donor position and leucine and isoleucine in the 
acceptor position (data not shown). For protein I, the assay was hin- 
dered by the fact that substrate amino acids are thioesterified on both 
modules so that dipeptides are formed immediately (data not shown). 

Dipeptide formation 

Assays for dipeptide formation were carried out in a total volume of
100 µl containing 50 pmol enzyme, 1 mM amino acids and 2 mM ATP in
a solution of 50 mM HEPES and 20 mM MgCl2. Negative controls
were assayed with no ATP or with only one of the two amino acids.
Assays were incubated for 3 h at 37°C before the reaction was
quenched by adding 100 µl n-butanol followed by vigorous shaking.
The addition of butanol led to precipitation of proteins that were then
removed with a pipette tip. The whole mixture was evaporated in a
speed-vac and resuspended in 100 µl 10% methanol before being
used for HPLC analysis.

To determine product turnover of the enzymes, samples were taken at 
distinct times and prepared in the same way as described above. The 
amount of product formed was compared to a quantified standard solution
of the purchased dipeptide by HPLC and HPLC MS carried out on
a Hewlett Packard Series 1100 MSD. We used a Nucleosil 120-3
C18 reverse-phase column.
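
A minimal Python sketch of this kind of quantification is given below. It assumes a single-point calibration with a linear detector response, and all peak areas and amounts are invented for illustration; the function names are hypothetical and the calculation is not claimed to be the one used by the authors.

```python
def product_pmol(sample_area: float, standard_area: float,
                 standard_pmol: float) -> float:
    """Single-point calibration: scale the standard amount by the area ratio."""
    return sample_area * standard_pmol / standard_area


def turnover_per_min(product: float, enzyme_pmol: float, minutes: float) -> float:
    """Moles of product per mole of enzyme per minute."""
    return product / enzyme_pmol / minutes


# Invented numbers: 500 pmol standard with peak area 1000, sample area 360,
# 50 pmol enzyme, sample taken after 180 min
amount = product_pmol(sample_area=360.0, standard_area=1000.0, standard_pmol=500.0)
rate = turnover_per_min(amount, enzyme_pmol=50.0, minutes=180.0)
print(amount)          # 180.0
print(round(rate, 3))  # 0.02
```

With several time points, the rate would more reliably be taken from the slope of product against time rather than from a single sample.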

Radioisotopes and chemicals 

Tetrasodium [32P]-pyrophosphate was purchased from NEN Life
Science Products. [U-14C]-isoleucine (260 Ci/mol, 100 µCi/ml),
[U-14C]-leucine (292 Ci/mol, 100 µCi/ml), [U-14C]-valine (200 Ci/mol,
100 µCi/ml), [U-14C]-phenylalanine (450 Ci/mol, 100 µCi/ml),
[U-14C]-tyrosine (497 Ci/mol, 100 µCi/ml) and [5-3H]-tryptophan
(22.1 Ci/mmol, 1 mCi/ml) were purchased from Hartmann Analytik.
Authentic dipeptides used as standards were purchased from Bachem or Sigma.

Supplementary material 

Supplementary material including SDS polyacrylamide gels of overproduced
dimodular hybrid NRPSs is available at
http://current-biology.com/supmat/supmatin.htm.